Scalable Constrained Clustering: A Generalized Spectral Method
نویسندگان
چکیده
We present a principled spectral approach to the wellstudied constrained clustering problem. It reduces clustering to a generalized eigenvalue problem on Laplacians. The method works in nearly-linear time and provides concrete guarantees for the quality of the clusters, at least for the case of 2-way partitioning. In practice this translates to a very fast implementation that consistently outperforms existing spectral approaches. We support this claim with experiments on various data sets: our approach recovers correct clusters in examples where previous methods fail, and handles data sets with millions of data points two orders of magnitude larger than before.
منابع مشابه
Simple and Scalable Constrained Clustering: a Generalized Spectral Method
We present a simple spectral approach to the well-studied constrained clustering problem. It captures constrained clustering as a generalized eigenvalue problem with graph Laplacians. The algorithm works in nearly-linear time and provides concrete guarantees for the quality of the clusters, at least for the case of 2-way partitioning. In practice this translates to a very fast implementation th...
متن کاملRobust and Efficient Computation of Eigenvectors in a Generalized Spectral Method for Constrained Clustering
FAST-GE is a generalized spectral method for constrained clustering [Cucuringu et al., AISTATS 2016]. It incorporates the mustlink and cannot-link constraints into two Laplacian matrices and then minimizes a Rayleigh quotient via solving a generalized eigenproblem, and is considered to be simple and scalable. However, there are two unsolved issues. Theoretically, since both Laplacian matrices a...
متن کامل3D Human Posture Segmentation by Constrained Spectral Clustering
In this paper, we propose a new algorithm for partitioning human posture represented by 3D point clouds sampled from the surface of human body. The algorithm is formed as a constrained extension of the recently developed segmentation method, spectral clustering (SC). Two folds of merits are offered by the algorithm: 1) as a nonlinear method, it is able to deal with the situation that data (poin...
متن کاملConstrained Spectral Clustering under a Local Proximity Structure Assumption
This work focuses on incorporating pairwise constraints into a spectral clustering algorithm. A new constrained spectral clustering method is proposed, as well as an active constraint acquisition technique and a heuristic for parameter selection. We demonstrate that our constrained spectral clustering method, CSC, works well when the data exhibits what we term local proximity structure. Empiric...
متن کاملRepeated Record Ordering for Constrained Size Clustering
One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into current clustering algorithms. One of the well researched constrained clustering algorithms is called microaggregation. In a microaggreg...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1601.04746 شماره
صفحات -
تاریخ انتشار 2015